Combining multiscale features for classification of hyperspectral images: a sequence based kernel approach
Hyperspectral image classification now commonly exploits spatial
information to improve accuracy. One of the most popular ways to integrate such
information is to extract hierarchical features from a multiscale segmentation.
In the classification context, the extracted features are commonly concatenated
into a long vector (also called a stacked vector), to which a
conventional vector-based machine learning technique is applied (e.g. an SVM with a
Gaussian kernel). In this paper, we instead propose to use a sequence-structured kernel:
the spectrum kernel. We show that the conventional stacked vector-based kernel
is actually a special case of this kernel. Experiments conducted on various
publicly available hyperspectral datasets illustrate the improvement of the
proposed kernel w.r.t. conventional ones using the same hierarchical spatial
features.
Comment: 8th IEEE GRSS Workshop on Hyperspectral Image and Signal Processing:
Evolution in Remote Sensing (WHISPERS 2016), UCLA in Los Angeles, California,
U.
Using support vector machines to approximate the contours of a value function in target-reaching problems
We propose using a particular learning algorithm, SVMs, to solve minimal-time target-reaching problems. The target corresponds to a desired state, given that the system deteriorates in a certain region of the state space (when it violates a set of viability constraints). The capture basin at time t is the set of states that can reach the target in a time less than or equal to t without leaving the viability constraint set. The boundaries of a capture basin at time t then correspond to the contours of a value function at time t. Capture basins can therefore be used to define a control function that lets the system reach the target in minimal time. We propose a new SVM-based approach that approximates the successive capture basins, and we define a control procedure that steers the system to the target. We illustrate this method on a simple example: the car-on-the-hill problem.
Combining multiple resolutions into hierarchical representations for kernel-based image classification
The geographic object-based image analysis (GEOBIA) framework has gained
increasing interest recently. Following this popular paradigm, we propose a
novel multiscale classification approach operating on a hierarchical image
representation built from two images at different resolutions. They capture the
same scene with different sensors and are naturally fused together through the
hierarchical representation, where coarser levels are built from a Low Spatial
Resolution (LSR) or Medium Spatial Resolution (MSR) image while finer levels
are generated from a High Spatial Resolution (HSR) or Very High Spatial
Resolution (VHSR) image. Such a representation provides context
information through the coarser levels and information on the spatial
arrangement of subregions through the finer levels. Two dedicated structured
kernels are then used to perform machine learning directly on the constructed
hierarchical representation. This strategy overcomes the limits of conventional
GEOBIA classification procedures that can handle only one or very few
pre-selected scales. Experiments run on an urban classification task show that
the proposed approach can substantially improve classification accuracy w.r.t.
conventional approaches working on a single scale.
Comment: International Conference on Geographic Object-Based Image Analysis
(GEOBIA 2016), University of Twente in Enschede, The Netherland
Classification of MODIS Time Series with Dense Bag-of-Temporal-SIFT-Words: Application to Cropland Mapping in the Brazilian Amazon
Mapping croplands is a challenging problem in a context of climate change and evolving agricultural calendars. Classification based on MODIS vegetation index time series is performed in order to map crop types in the Brazilian state of Mato Grosso. We used the recently developed Dense Bag-of-Temporal-SIFT-Words algorithm, which is able to capture temporal locality of the data. It allows the accurate detection of around 70% of the agricultural areas. It leads to better classification rates than a baseline algorithm and discriminates classes with similar profiles more accurately.
Fast Optimal Transport through Sliced Wasserstein Generalized Geodesics
Wasserstein distance (WD) and the associated optimal transport plan have been
proven useful in many applications where probability measures are at stake. In
this paper, we propose a new proxy of the squared WD, coined min-SWGG, that is
based on the transport map induced by an optimal one-dimensional projection of
the two input distributions. We draw connections between min-SWGG and
Wasserstein generalized geodesics in which the pivot measure is supported on a
line. We notably provide a new closed form for the exact Wasserstein distance
in the particular case where one of the distributions is supported on a line, allowing
us to derive a fast computational scheme that is amenable to gradient descent
optimization. We show that min-SWGG is an upper bound of WD and that it has a
complexity similar to that of Sliced-Wasserstein, with the additional feature of
providing an associated transport plan. We also investigate some theoretical
properties such as metricity, weak convergence, computational and topological
properties. Empirical evidence supports the benefits of min-SWGG in various
contexts, including gradient flows, shape matching and image colorization, among
others.
Comment: Main: 10 pages, 4 Figures Tables Supplementary: 19 pages, 13 Figures,
1 Table. Submitted to Neurips 202
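As a rough illustration of the idea in this abstract, the following Python sketch evaluates, for a given direction, the cost of the transport plan obtained by sorting both point clouds along that direction, and then keeps the best direction found. It assumes two point clouds of equal size with uniform weights. The function names `swgg_sq` and `min_swgg_random_search` are hypothetical, and the random search over directions is a simple stand-in chosen for brevity, not the gradient-based scheme the abstract mentions.

```python
import numpy as np


def swgg_sq(X, Y, theta):
    """Squared cost of the plan that pairs points by their rank along theta.

    Assumes X and Y have the same number of points and uniform weights.
    """
    theta = theta / np.linalg.norm(theta)
    ix = np.argsort(X @ theta)  # order of X along the projection direction
    iy = np.argsort(Y @ theta)  # order of Y along the projection direction
    # Pair the i-th smallest projection of X with the i-th smallest of Y,
    # but measure the cost in the original space.
    return float(np.mean(np.sum((X[ix] - Y[iy]) ** 2, axis=1)))


def min_swgg_random_search(X, Y, n_dirs=200, seed=0):
    """Keep the best of n_dirs random directions (illustrative stand-in)."""
    rng = np.random.default_rng(seed)
    thetas = rng.normal(size=(n_dirs, X.shape[1]))
    values = [swgg_sq(X, Y, t) for t in thetas]
    best = int(np.argmin(values))
    return values[best], thetas[best] / np.linalg.norm(thetas[best])
```

Because every direction yields a valid one-to-one pairing, the returned value can never fall below the exact squared Wasserstein distance between the two uniform point clouds, which is consistent with the upper-bound property stated in the abstract.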
Match-And-Deform: Time Series Domain Adaptation through Optimal Transport and Temporal Alignment
While large volumes of unlabeled data are usually available, associated
labels are often scarce. The unsupervised domain adaptation problem aims at
exploiting labels from a source domain to classify data from a related, yet
different, target domain. When time series are at stake, new difficulties arise
as temporal shifts may appear in addition to the standard feature distribution
shift. In this paper, we introduce the Match-And-Deform (MAD) approach that
aims at finding correspondences between the source and target time series while
allowing temporal distortions. The associated optimization problem
simultaneously aligns the series thanks to an optimal transport loss and the
time stamps through dynamic time warping. When embedded into a deep neural
network, MAD helps learn new representations of time series that both align
the domains and maximize the discriminative power of the network. Empirical
studies on benchmark datasets and remote sensing data demonstrate that MAD
makes meaningful sample-to-sample pairing and time shift estimation, reaching
similar or better classification performance than state-of-the-art deep time
series domain adaptation strategies
Viability and Resilience of Languages in Competition
We study the viability and resilience of languages, using a simple dynamical model of two languages in competition. Assuming that public action can modify the prestige of a language in order to avoid language extinction, we analyze two cases: (i) the prestige can only take two values, (ii) it can take any value but its change at each time step is bounded. In both cases, we determine the viability kernel, that is, the set of states for which there exists an action policy maintaining the coexistence of the two languages, and we define such policies. We also study the resilience of the languages and identify configurations from which the system can return to the viability kernel (finite resilience), or in which one of the languages is led to disappear (zero resilience). Within our current framework, the maintenance of a bilingual society is shown to be possible by introducing the prestige of a language as a control variable.
Machine learning for structured data
With the aim of understanding environmental systems from Earth observation, it is essential to analyze complex remote sensing data by designing and leveraging new machine learning techniques. These data are often structured, in the sense that we face time series of remote sensing images that can be represented as graphs or hierarchical representations. This HDR manuscript presents some of my research work since I joined the Obelix team at IRISA at the end of 2012. Three axes are developed here. The first deals with learning on high-resolution images, for which dedicated tools are proposed to account for the correlation between pixels as well as between pixel attributes. The second is about time series, for which effective learning algorithms are proposed. Finally, leveraging optimal transport, a new metric for comparing graphs is described. The specific case of outliers is also treated with the definition of dedicated algorithmic tools. For each research axis, perspectives are given: defining learning algorithms in non-Euclidean spaces for hierarchical data, learning from time series from different domains, and integrating the implicit structure present in the data within the learning algorithms.
Anomaly detection with score functions based on the reconstruction error of the kernel PCA
We propose a novel non-parametric statistical test that allows the detection of anomalies given a set of (possibly high dimensional) sample points drawn from a nominal probability distribution. Our test statistic is the distance of a query point mapped in a feature space to its projection on the eigen-structure of the kernel matrix computed on the sample points. Indeed, the eigenfunction expansion of a Gram matrix depends on the input data density f0. The resulting statistical test is shown to be uniformly most powerful for a given false alarm level alpha when the alternative density is uniform over the support of the null distribution. The algorithm can be computed in O(n^3 + n^2) and testing a query point only involves matrix vector products. Our method is tested on both artificial and benchmarked real data sets and demonstrates good performance w.r.t. competing methods.
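The test statistic described in this abstract, the distance between a mapped query point and its projection onto the kernel eigen-structure, can be sketched with a standard kernel PCA reconstruction error. The Python below is a minimal illustration, not the authors' procedure: the Gaussian kernel, the bandwidth `gamma`, the number of components `q`, and the function names are all assumptions, and the calibration of a threshold for a given false alarm level is left out.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_kpca(X, gamma, q):
    """Eigen-decompose the centered Gram matrix and keep q components."""
    K = rbf_kernel(X, X, gamma)
    row_mean = K.mean(axis=0)
    total_mean = K.mean()
    Kc = K - row_mean[None, :] - row_mean[:, None] + total_mean
    eigvals, eigvecs = np.linalg.eigh(Kc)  # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:q]    # indices of the q largest
    # Scale so that each feature-space eigenvector has unit norm.
    alphas = eigvecs[:, idx] / np.sqrt(eigvals[idx])
    return {"X": X, "gamma": gamma, "alphas": alphas,
            "row_mean": row_mean, "total_mean": total_mean}


def reconstruction_error(model, Z):
    """Squared feature-space distance from each row of Z to the q-dim subspace."""
    Kz = rbf_kernel(Z, model["X"], model["gamma"])
    kz_mean = Kz.mean(axis=1, keepdims=True)
    # Centered inner products between queries and training points.
    Kz_c = Kz - kz_mean - model["row_mean"][None, :] + model["total_mean"]
    # Squared norm of the centered query; k(z, z) = 1 for the Gaussian kernel.
    norm_sq = 1.0 - 2.0 * kz_mean[:, 0] + model["total_mean"]
    proj = Kz_c @ model["alphas"]          # coordinates on the q components
    return norm_sq - (proj ** 2).sum(axis=1)
```

Fitting requires one eigen-decomposition of the n x n Gram matrix, while scoring a query only needs matrix-vector products against the stored coefficients, which matches the cost profile given in the abstract. A larger error suggests a point that the nominal sample explains poorly.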
A statistical test for anomaly detection based on the reconstruction error of kernel PCA
Anomaly detection aims at declaring a query point as "normal" or not with respect to a nominal model. A non-parametric statistical test that allows the detection of anomalies given a set of (possibly high dimensional) sample points drawn from a nominal probability distribution is presented. Its test statistic is based on the distance between a query point, mapped in a feature space, and its projection on the eigen-structure of the kernel matrix computed on the sample points. The method is tested on both artificial and benchmarked real data sets and demonstrates good performance regarding both type-I and type-II errors w.r.t. competing methods. This communication is based on a recently published paper by the same authors.